-
Notifications
You must be signed in to change notification settings - Fork 1k
Row Name Extraction for data.table() with keep.rownames #7136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Generated via commit 6dbc414 Download link for the artifact containing the test results: ↓ atime-results.zip
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #7136 +/- ##
=======================================
Coverage 98.50% 98.50%
=======================================
Files 81 81
Lines 15016 15032 +16
=======================================
+ Hits 14792 14808 +16
Misses 224 224 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hey @Mukulyadav2004, please ping when tests are passing, or if you're stuck & need an extra pair of eyes. Thanks! |
|
Thanks @MichaelChirico ! These both tests are passing when I run on Rstudio but can't figure out why they are failing here. |
R/as.data.table.R
Outdated
| if (any(nzchar(valid_names))) { | ||
| vector_rownames = valid_names | ||
| x[[i]] = unname(xi) | ||
| break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am leaning towards merging this logic into the below loop. We can write check_rownames = !isFALSE(keep.rownames) and then the marginal cost of checking if (check_rownames && ...) is low. We can replace the early break with checking is.null(vector_rownames). WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree
inst/tests/tests.Rraw
Outdated
| x <- c(1, 2, 3) | ||
| y <- setNames(c(4, 5, 6), c("A", "B", "C")) | ||
| test(2329.1, as.data.table(list(x, y), keep.rownames=TRUE), data.table(rn=c("A", "B", "C"), V1=c(1, 2, 3), V2=c(4, 5, 6))) | ||
| test(2329.2, as.data.table(list(x, y), keep.rownames="custom"), data.table(custom=c("A", "B", "C"), V1=c(1, 2, 3), V2=c(4, 5, 6))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Let's make the test suite more extensive:
- Test also
list(y, x) - Test the behavior under
data.frame(), not justas.data.table - Test your condition about
any(nzchar(valid_names))
I also don't think we've matched data.frame behavior yet, c.f.
DF = data.frame(row.names = letters, V = 1:26)
head(rownames(data.frame(a = 26:1, DF)))
# [1] "a" "b" "c" "d" "e" "f"
You were affected by an underlying change in |
|
@Mukulyadav2004 I'm curious about the Observe that we already can produce M = cbind(1:3)
rownames(M) = rep("", 3L)
as.data.table(M, keep.rownames='blank')
# blank V1
# 1: 1
# 2: 2
# 3: 3Maybe we should drop that condition here too? |
…into issue_1916
|
I included the |
|
Thanks! The data.frame behavior makes sense, but yes, let's break from that. consistency within the data.table API is more important. |
|
learned a lot from your changes, thank you. |
| test(2330.5, as.data.table(data.frame(y, x), keep.rownames=TRUE), data.table(rn=c("A", "B", "C"), y=4:6, x=1:3)) | ||
|
|
||
| DF <- data.frame(row.names = letters[1:6], V = 1:6) # Test data.frame with explicit rownames | ||
| test(2330.6, as.data.table(list(a = 6:1, DF), keep.rownames=TRUE), data.table(rn=letters[1:6], a=6:1, V=1:6)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than remove the rest with all-empty names, let's test the expected behavior in that case as well.
Please also add a test of list(M) for empty-rowname'd matrix input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's basically ready to go, please don't forget a NEWS entry and updated manual description
MichaelChirico
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thank you! I'm going to start using this right away!!

fixes #1916
Implements row name extraction from named vectors in
data.table()andas.data.table()calls whenkeep.rownames=TRUEorkeep.rownames="column_name". This matches the behavior ofdata.frame()andas.data.frame.list()by extracting names from the first named atomic vector in the input.Problem : Currently,
data.table()andas.data.table()do not extract row names from named vectors likedata.frame().So added logic to
as.data.table.list()(the common path for bothdata.table()andas.data.table()) to:@MichaelChirico @tdhock can you please review.